NSF PAR Search | NSF Public Access Repository


A Guided Topic-Noise Model for Short Texts

https://doi.org/10.1145/3485447.3512007

Churchill, Robert; Singh, Lisa; Ryan, Rebecca; Davis-Kean, Pamela (April 2022, WWW '22: Proceedings of the ACM Web Conference 2022)

Researchers using social media data want to understand the discussions occurring in and about their respective fields. These domain experts often turn to topic models to help them see the entire landscape of the conversation, but unsupervised topic models often produce topic sets that miss topics experts expect or want to see. To solve this problem, we propose the Guided Topic-Noise Model (GTM), a semi-supervised topic model designed with large domain-specific social media data sets in mind. The input to GTM is a set of topics that are of interest to the user and a small number of words or phrases that belong to those topics. These seed topics guide the topic generation process and can be augmented interactively, expanding the seed word list as the model provides new relevant words for different topics. GTM uses a novel initialization and a new sampling algorithm called Generalized Polya Urn (GPU) seed word sampling to produce a topic set that includes expanded seed topics, as well as new unsupervised topics. We demonstrate the robustness of GTM on open-ended responses from a public opinion survey and four domain-specific Twitter data sets.
Full Text Available
Parenting online: analyzing information provided by parenting-focused Twitter accounts

https://doi.org/10.1080/15456870.2022.2061713

Ryan, Rebecca; Davis-Kean, Pamela; Bode, Leticia; Krüger, Jule; Mneimneh, Zeina; Singh, Lisa (January 2022, Atlantic Journal of Communication)

This study investigated the content of parenting information shared on social media by identifying the range and frequency of topics shared by parenting-focused accounts on Twitter. Using the Twitter API, a universe of 675,069 tweets was gathered from 74 of the most-followed parenting-focused accounts, or “hubs,” from January 2016 to June 2018. Using a custom, semi-automated topic modeling approach, we identified the topics – and subtopics within topics – parenting hubs shared with their followers and investigated whether any meaningful differences in topical focus existed between accounts targeting mothers versus fathers. Results indicate that over one third of tweets were about Parenting Behavior and nearly one quarter about Health, while Entertainment, School, and Motherhood and Fatherhood generally were tweeted about less often. Mother-focused accounts tweeted more about Health than father-focused accounts, which tweeted more than others about Entertainment. Implications for future parenting and social media research are discussed.
Full Text Available
Text Analytic Research Portals: Supporting Large-Scale Social Science Research

https://doi.org/10.1109/BigData52589.2021.9671696

Singh, Lisa; Padden, Colton; Davis-Kean, Pamela; David, Rabin; Marwadi, Virinche; Ren, Yiqing; Vanarsdall, Rebecca (December 2021, 2021 IEEE International Conference on Big Data (Big Data))

Large-scale organic data generated from newspapers, social media, television, and radio require expertise in infrastructure management, data collection, and data processing in order to gain research value from them. We have developed text analytic research portals to help social science researchers who do not have the resources necessary to collect, store, and process these large-scale data sets. Our portals allow researchers to use an intuitive point-and-click interface to generate variables from large, dynamic data sets using state-of-the-art text mining and learning methods. These timely variables constructed from noisy text can then be used to advance social science research in areas such as political science, economics, public health, and psychology.
Full Text Available
Best practices for addressing missing data through multiple imputation

https://doi.org/10.1002/icd.2407

Woods, Adrienne D.; Gerasimova, Daria; Van Dusen, Ben; Nissen, Jayson; Bainter, Sierra; Uzdavines, Alex; Davis‐Kean, Pamela E.; Halvorson, Max; King, Kevin M.; Logan, Jessica A.; et al (January 2023, Infant and Child Development)

A common challenge in developmental research is the amount of incomplete and missing data that results from respondents failing to complete tasks or questionnaires or disengaging from the study (i.e., attrition). This missingness can lead to biases in parameter estimates and, hence, in the interpretation of findings. These biases can be addressed through statistical techniques that adjust for missing data, such as multiple imputation. Although multiple imputation is highly effective, it has not been widely adopted by developmental scientists because of barriers such as lack of training or misconceptions about imputation methods. Relying on the default methods of statistical software programs, such as listwise deletion, is common but may introduce additional bias. This manuscript is intended to provide practical guidelines for developmental researchers to follow when examining their data for missingness, making decisions about how to handle that missingness, and reporting the extent of missing data biases and specific multiple imputation procedures in publications.
Full Text Available
Early word‐learning skills: A missing link in understanding the vocabulary gap?

https://doi.org/10.1111/desc.13034

Shavlik, Margaret; Davis‐Kean, Pamela E.; Schwab, Jessica F.; Booth, Amy E. (September 2020, Developmental Science)

Socioeconomic status (SES) has been repeatedly linked to the developmental trajectory of vocabulary acquisition in young children. However, the nature of this relationship remains underspecified. In particular, despite an extensive literature documenting young children's reliance on a host of skills and strategies to learn new words, little attention has been paid to whether and how these skills relate to measures of SES and vocabulary acquisition. To evaluate these relationships, we conducted two studies. In Study 1, 205 2.5‐ to 3.5‐year‐old children from widely varying socioeconomic backgrounds were tested on a broad range of word‐learning skills that tap their ability to resolve cases of ambiguous reference and to extend words appropriately. Children's executive functioning and phonological memory skills were also assessed. In Study 2, 77 of those children returned for a follow‐up session several months later, at which time two additional measures of vocabulary were obtained. Using Structural Equation Modeling (SEM) and multivariate regression, we provide evidence of the mediating role of word‐learning skills on the relationship between SES and vocabulary skill over the course of early development.
Full Text Available
Next directions in measurement of the home mathematics environment: An international and interdisciplinary perspective

https://doi.org/10.5964/jnc.6143

Hornburg, Caroline Byrd; Borriello, Giulia A.; Kung, Melody; Lin, Joyce; Litkowski, Ellen; Cosso, Jimena; Ellis, Alexa; King, Yemimah A.; Zippert, Erica; Cabrera, Natasha J.; et al (January 2021, Journal of Numerical Cognition)

This article synthesizes findings from an international virtual conference, funded by the United States National Science Foundation, focused on the home mathematics environment (HME). In light of inconsistencies and gaps in research investigating relations between the HME and children’s outcomes, the purpose of the conference was to discuss actionable steps and considerations for future work. The conference was composed of international researchers with a wide range of expertise and backgrounds. Presentations and discussions during the conference centered broadly on the need to better operationalize and measure the HME as a construct—focusing on issues related to child, family, and community factors, country and cultural factors, and the cognitive and affective characteristics of caregivers and children. Results of the conference and a subsequent writing workshop include a synthesis of core questions and key considerations for the field of research on the HME. Findings highlight the need for the field at large to use multi-method measurement approaches to capture nuances in the HME, and to do so with increased international and interdisciplinary collaboration, open science practices, and communication among scholars.
Full Text Available
